Phoneme Recognition using Competitive Neural Trees
Authors
Abstract
This paper applies the Competitive Neural Tree (CNeT) method to phoneme recognition, a pattern classification problem. CNeTs combine the advantages of Decision Trees and Competitive Neural Networks. The CNeT algorithm works by hierarchically clustering the given examples while growing a tree. Different search methods, as well as stopping and splitting criteria, are discussed. The CNeT algorithm allows probability estimation and interpolation to be used for recall. On a phoneme recognition benchmark, the CNeT performs as well as an MLP but has a much lower time complexity.
Phoneme recognition is one of the earlier steps in automated speech recognition. First, the speech signal is recorded and digitized. The next step is feature extraction: the raw sample is divided into time frames, and for each frame a characteristic set of coefficients (a feature vector) is computed. The phoneme recognition component processes a sequence of feature vectors to derive a phonetic labeling of the speech signal. From the phoneme sequence, more abstract representations, such as sequences of syllables, words, and sentences, are derived in subsequent steps. The phoneme recognition task consists of producing, from a given sequence of feature vectors, a phoneme label or a probability estimate of membership in a phoneme class.
In this paper we introduce a phoneme recognition component that uses Competitive Neural Trees (CNeT). We compare it to a system using Multi-Layer Perceptrons (MLP) trained with the well-known Back-Propagation algorithm and to a system using Nearest Neighbor (NN) classification.
The CNeT has a structured architecture. A hierarchy of identical nodes forms an m-ary tree. Each node contains m slots s_1, s_2, ..., s_m and a counter age that is incremented each time an example is presented to the node. Each slot s_i stores a prototype p_i, a counter count, class counters count_j, and a pointer to a node.
The prototypes p_i, which have the same length as the input vectors x, are trained to become centroids of example clusters R_i. The class counter count_j and the counter count are incremented each time the prototype of the slot is updated for an example of class C_j. A child node may be assigned to the slot via the pointer.
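The node layout described above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the names `Slot`, `Node`, `make_node`, `present`, and `recall` are hypothetical, the fixed learning rate `rate` and the centroid-style update are assumptions borrowed from standard competitive learning, and the greedy descent in `recall` is only one of the search methods the abstract alludes to. Class probabilities are estimated here as count_j / count, which is one plausible reading of how the class counters support probability estimation.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    prototype: List[float]          # p_i, same length as the input vector x
    count: int                      # number of updates of this prototype
    class_counts: List[int]         # count_j, one counter per class C_j
    child: Optional["Node"] = None  # pointer to a child node, if any


@dataclass
class Node:
    slots: List[Slot]               # the m slots s_1 ... s_m
    age: int = 0                    # incremented for every presented example


def make_node(prototypes: List[List[float]], n_classes: int) -> Node:
    """Build a node with one slot per initial prototype."""
    return Node([Slot(list(p), 0, [0] * n_classes) for p in prototypes])


def nearest_slot(node: Node, x: List[float]) -> Slot:
    """Competition step: the slot whose prototype is closest to x wins."""
    return min(node.slots, key=lambda s: math.dist(s.prototype, x))


def present(node: Node, x: List[float], label: int, rate: float = 0.1) -> Slot:
    """Present one labeled example to a node (assumed update rule)."""
    node.age += 1
    winner = nearest_slot(node, x)
    # Move the winning prototype a small step toward the example.
    winner.prototype = [p + rate * (xi - p) for p, xi in zip(winner.prototype, x)]
    winner.count += 1
    winner.class_counts[label] += 1
    return winner


def recall(node: Node, x: List[float]) -> List[float]:
    """Greedy descent to a leaf slot, then estimate class probabilities."""
    slot = nearest_slot(node, x)
    while slot.child is not None:
        slot = nearest_slot(slot.child, x)
    if slot.count == 0:
        return [0.0] * len(slot.class_counts)
    return [c / slot.count for c in slot.class_counts]


root = make_node([[0.0, 0.0], [1.0, 1.0]], n_classes=2)
present(root, [0.1, 0.0], label=0)
present(root, [0.9, 1.0], label=1)
present(root, [0.0, 0.2], label=0)
print(recall(root, [0.05, 0.05]))
```

Tree growth (the splitting and stopping criteria) is deliberately left out of the sketch, since the abstract does not specify them; a split would assign a new `Node` to `slot.child`.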
Similar resources
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
Improving the Performance of a Continuous Speech Recognition System Using Features Extracted from Speech Manifolds in the Reconstructed Phase Space
Designing new feature extraction methods for the speech signal and combining the information they provide is one of the most effective approaches to improving the performance of automatic speech recognition (ASR) systems. Recent research has shown that the speech signal has nonlinear and chaotic properties, but the effects of these properties are not used in the continuous ...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Isolated Voiced Digit Recognition Using Inductive Inference
This paper proposes the use of inductive inference "decision trees" for isolated digit recognition. The aim of this research is to demonstrate that inductive learning can provide an alternative approach to existing automatic speech recognition techniques such as Dynamic Time Warping (DP), Hidden Markov Modelling (HMM) and Neural Networks (NN). The construction of the decision tree is based on C...
Error Analysis Using Decision Trees in Spontaneous Presentation Speech Recognition
This paper proposes the use of decision trees for analyzing errors in spontaneous presentation speech recognition. The trees are designed to predict whether a word or a phoneme can be correctly recognized or not, using word or phoneme attributes as inputs. The trees are constructed using training “cases” by choosing questions about attributes step by step according to the gain ratio criterion. ...
Publication date: 2007